


Identification of Finite Dimensional Lévy Systems in
Financial Mathematics

L. Gerencsér · M. Mánfay






Abstract Lévy processes are widely used in financial mathematics to model
return data. Price processes are then defined as a corresponding geometric Lévy
[...] so-called empirical characteristic function method (ECF) originally devised
for es[...] will be obtained. Their potential to outperform the prediction error
method in estimating the system parameters will also be demonstrated.
1 Introduction

The classical model of market dynamics, geometric Brownian motion, goes back
to the work of Louis Bachelier [1]. It is still the accepted core model, although
empirical studies have revealed that its assumptions are not realistic. For
example, since price movements are induced by transactions, which can be
unevenly distributed in real time, it would be more natural to use a time-changed
Brownian motion to model price dynamics. If the time change is defined by a gamma
process, we obtain the so-called VG (shorthand for Variance Gamma) process. VG
processes reproduce a number of stylized facts of real price processes, such as
fat tails and large kurtosis. It can be shown that the above time-changed
Brownian motion is itself a Lévy process. Extending this construction, a variety
of authors have proposed novel price dynamics, called geometric Lévy processes,
which are obtained by exponentiating a Lévy process.

A Lévy process $(Z_t)$ is much like a Wiener process: it has stationary and
independent increments, but discontinuities, or jumps, are allowed. A good survey
of Lévy processes used in financial modelling is the paper by Miyahara and
Novikov [18]; [H] studies several problems arising in the field of exponential
Lévy processes. For an excellent introduction to the theory of Lévy processes
see [3]. A key building block in the theory of Lévy processes is the compound
Poisson process. A more general class of Lévy processes is formally obtained via

$$Z_t = \int_0^t \int_{\mathbb{R}} x \, N(ds, dx), \qquad (1)$$

where $N(dt, dx)$ is a time-homogeneous, space-time Poisson point process, counting
the number of jumps of size $x$ at time $t$. In this case $Z_t$ is a pure jump
process, which in particular means that the Lévy-Itô decomposition of $Z_t$ does
not have a Brownian motion component (but it may have a drift term). The intensity
of $N(dt, dx)$ is defined by $E[N(dt, dx)]$, which, due to time homogeneity, can be
written as

$$E[N(dt, dx)] = dt \cdot \nu(dx),$$

where $\nu(dx)$ is the Lévy measure. The representation given in (1) is
mathematically rigorous if

$$\int_{\mathbb{R}} \min(|x|, 1)\, \nu(dx) < \infty. \qquad (2)$$

Under this condition the sample paths of $Z_t$ are of finite variation, a property
supported by empirical evidence for most indices, as emphasized in [B]. The
characteristic function of a Lévy process can be written in the form

$$E\, e^{iuZ_t} = e^{t\psi(u)}, \qquad (3)$$

where $\psi(u)$ is the characteristic exponent.

The standard model of a price process within this framework is then

$$S_t = S_0 \exp Z_t, \qquad (4)$$

and $(S_t)$ is called a geometric Lévy process. A variety of choices for $(Z_t)$
has been proposed in the literature: it can be a stable process, a Variance Gamma
(VG) process, a tempered stable process, a special case of which is the CGMY
process, a hypergeometric process or a Normal-inverse Gaussian (NIG) process.

The motivation behind these models is the assumption that the returns of the
stock process, say $(S_{t+h} - S_t)/S_t$, are independent and stationary. While
this is an attractive assumption, its consequences are less attractive. In
particular it follows that the variance of the price process tends to infinity,
which is certainly unnatural for, say, prices of agricultural products. A closer
look at data in fact reveals that there is a weak correlation between daily
returns $(S_{t+1} - S_t)/S_t$. For example, considering data on IBM Coca Cola
stock prices in a period of 20 years from Nov 1990 to Nov 2010 we found for the
correlation coefficient of daily log-returns $X_t$ that

$$\mathrm{corr}(X_t, X_{t-1}) = -0.135.$$

This small, but non-negligible, negative correlation calls for a refinement of the
exponential Lévy model, allowing memory in the daily return process. An intuitive
empirical argument can also be given in favor of the need for memory: an
overreaction of the market is generally followed by a correction, resulting in a
correlation between daily returns. The recently much studied geometric fractional
Brownian motion model gives a return process with non-independent increments; for
more details on fractional Brownian motion see the papers of T.E. Duncan, for
example [16].

We propose to introduce a new class of models, using the methodology of linear
system theory, to capture the presence of decaying memory. The infinitesimal
increments of the logarithm of the price process will be defined as a process
$dY_t$ which is the output of a finite dimensional stable linear SISO (shorthand
for single-input-single-output) system, driven by a Lévy process:

$$dY_t = A\, dZ_t,$$

where $A$ represents the linear mapping from input to output, and $Z$ is a Lévy
process. For the sake of convenience we let $-\infty < t < +\infty$. In the case
of a finite dimensional stable linear SISO system the mapping $A$ can be described
by a set of state-space equations; a well-known example of such systems is
defined by

$$dX_t = H X_t\, dt + K\, dZ_t \qquad (5)$$
$$dY_t = L X_t\, dt + dZ_t. \qquad (6)$$

From the above equations we get

$$X_t = \int_{-\infty}^{t} e^{H(t-s)} K\, dZ_s. \qquad (7)$$

The inverse system is formally obtained as

$$dX_t = (H - KL) X_t\, dt + K\, dY_t \qquad (8)$$
$$dZ_t = dY_t - L X_t\, dt. \qquad (9)$$

It is assumed that both systems $A$ and $A^{-1}$ are exponentially stable;
equivalently, we assume that both $H$ and $(H - KL)$ are stable matrices. Such a
system will be called a Lévy system.
The inverse filter has the following form:

$$dX_t = (H - KL) X_t\, dt + K\, dY_t \qquad (10)$$
$$d\varepsilon_t = dY_t - L X_t\, dt. \qquad (11)$$



Having defined the infinitesimal increments of the logarithm of the price process,
we define the price process according to (4):

$$S_t = S_0 \exp Y_t.$$

In the statistical analysis of such systems, both the system dynamics and the
fine characteristics of $(Z_t)$ are to be identified. The first difficulty of
applying a maximum-likelihood (ML) method lies in the fact that there is no natural
reference measure in the space of sample paths. In addition, the computation of
the Radon-Nikodym derivative is practically not feasible since
$\int_{-\infty}^{t} e^{H(t-s)} K\, dZ_s$ is not even a Lévy process.

To avoid this problem we consider an alternative discrete-time model class,
where the daily log-returns $\Delta y_n$ are defined via a discrete time finite
dimensional system

$$\Delta y_n = A\, \Delta Z_n, \qquad (12)$$

where $A$ represents the linear mapping from input to output, and $\Delta Z_n$ is
the increment of a Lévy process $Z$ over an interval $[(n-1)h, nh)$, with some
fixed $h > 0$. For the sake of convenience we let $-\infty < n < +\infty$. A state
space equation for this model is given by

$$\Delta X_{n+1} = H\, \Delta X_n + K\, \Delta Z_n \qquad (13)$$
$$\Delta y_n = L\, \Delta X_n + \Delta Z_n. \qquad (14)$$

We will call this model a discrete time finite dimensional Lévy system. Assume
that $A = A(\theta^*)$, where $\theta^*$ is an unknown parameter vector, and
similarly, let $\nu(dx) = \nu(dx, \eta^*)$, where $\eta^*$ denotes an unknown
parameter vector. The ranges of $\theta^*$ and $\eta^*$ are assumed to be known.
The fundamental problem to be discussed in this paper is to identify this system
and to establish sharp results for the error of the estimator.
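
The state-space recursion (13)-(14) and its inverse can be exercised numerically. The sketch below is illustrative only and is not taken from the paper: it assumes a scalar state, an input gain K in the state equation (so that the inverse filter has dynamics H - KL), hypothetical parameter values, and zero-mean noise generated as the difference of two gamma variates as a stand-in for a variance-gamma-type increment.

```python
import random


def simulate_levy_system(H, K, L, dz):
    """Forward system: dX[n+1] = H*dX[n] + K*dZ[n], dy[n] = L*dX[n] + dZ[n].

    Scalar state and zero initial condition are simplifying assumptions.
    """
    x, dy = 0.0, []
    for z in dz:
        dy.append(L * x + z)
        x = H * x + K * z
    return dy


def invert_levy_system(H, K, L, dy):
    """Inverse filter: dX[n+1] = (H - K*L)*dX[n] + K*dy[n], eps[n] = dy[n] - L*dX[n]."""
    x, eps = 0.0, []
    for y in dy:
        eps.append(y - L * x)
        x = (H - K * L) * x + K * y
    return eps


def vg_like_noise(n, shape, scale, rng):
    """Zero-mean i.i.d. noise: difference of two independent gamma variates."""
    return [rng.gammavariate(shape, scale) - rng.gammavariate(shape, scale)
            for _ in range(n)]


rng = random.Random(0)
H, K, L = 0.5, 1.0, 0.3          # hypothetical values; H and H - K*L lie in (-1, 1)
dz = vg_like_noise(1000, 0.5, 1.0, rng)
dy = simulate_levy_system(H, K, L, dz)
eps = invert_levy_system(H, K, L, dy)
max_error = max(abs(e - z) for e, z in zip(eps, dz))
print("largest reconstruction error:", max_error)
```

With the true parameters and matching initial conditions the inverse filter returns the driving noise up to rounding error, which is the property that the identification methods below rely on.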

If we knew the probability density function of the noise $\Delta Z_n$ then we
could apply an ML (Maximum Likelihood) estimation method, and establish sharp
results for the estimation error, see [9]. The challenge of the present problem
is that it is the characteristic function of the noise that is explicitly given.
A natural approach to solve this problem is to combine techniques of system
identification with the empirical characteristic function (ECF) method widely used
in finance to analyze i.i.d. data. Before going into further details we present a
few examples of Lévy processes used in finance.



2 Lévy processes in finance

To model the increments of the logarithm of a price process a wide range of
geometric Lévy processes has been proposed by a variety of authors. Mandelbrot
suggested using an $\alpha$-stable process to model the price dynamics of wool,
see [15]. An $\alpha$-stable process with $0 < \alpha < 2$ is defined via the
Lévy measure

$$\nu(dx) = C^- |x|^{-1-\alpha} 1_{x<0}\, dx + C^+ |x|^{-1-\alpha} 1_{x>0}\, dx.$$

A recently widely studied class of Lévy processes is the CGMY process due to
Carr, Geman, Madan and Yor [10]. It is obtained by setting $C^- = C^+$, and then,
separately for $x > 0$ and $x < 0$, multiplying the Lévy density of the original
symmetric stable process with a decreasing exponential. The corresponding Lévy
measure, using standard parametrization, is of the form

$$\nu(dx) = \frac{C e^{-G|x|}}{|x|^{1+Y}} 1_{x<0}\, dx + \frac{C e^{-Mx}}{x^{1+Y}} 1_{x>0}\, dx,$$

where $C, G, M > 0$, and $0 < Y < 2$. Intuitively, $C$ controls the level of
activity, while $G$ and $M$ together control skewness. Typically $G > M$,
reflecting the fact that prices tend to increase rather than decrease. $Y$
controls the density of small jumps, i.e. the fine structure. For $Y < 1$ the
integrability condition (2) is satisfied, thus the corresponding Lévy process is
of finite variation. The characteristic exponent of the CGMY process is given by

$$\psi(u) = C\,\Gamma(-Y) \left( (M - iu)^Y - M^Y + (G + iu)^Y - G^Y \right), \qquad (15)$$

where $\Gamma$ denotes the gamma function.
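
As a small numerical companion to (15), the following sketch evaluates the CGMY characteristic exponent and checks it against the variance per unit time, $C\,\Gamma(2-Y)(M^{Y-2} + G^{Y-2})$. That closed form is not stated in the text; it is the standard consequence of differentiating (15) twice at $u = 0$ and is used here as an assumption. The function names and parameter values are hypothetical.

```python
import math


def cgmy_char_exponent(u, C, G, M, Y):
    """Characteristic exponent (15) of the CGMY process at real u.

    Requires C, G, M > 0 and 0 < Y < 2 with Y != 1, since Gamma(-Y) has
    poles at Y = 0 and Y = 1.
    """
    return C * math.gamma(-Y) * (
        (M - 1j * u) ** Y - M ** Y + (G + 1j * u) ** Y - G ** Y
    )


def cgmy_variance(C, G, M, Y):
    """Variance per unit time, obtained as -psi''(0) from (15)."""
    return C * math.gamma(2.0 - Y) * (M ** (Y - 2.0) + G ** (Y - 2.0))


# Hypothetical parameter values, chosen only for illustration.
C, G, M, Y = 1.0, 6.0, 4.0, 0.5
step = 1e-3
second_difference = (
    cgmy_char_exponent(step, C, G, M, Y)
    + cgmy_char_exponent(-step, C, G, M, Y)
    - 2.0 * cgmy_char_exponent(0.0, C, G, M, Y)
) / step ** 2
print("finite-difference variance:", -second_difference.real)
print("closed-form variance:      ", cgmy_variance(C, G, M, Y))
```

The two printed values should agree to several digits, which is a convenient check that an implementation of (15) uses the intended branch of the complex power.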

Allowing $C$ and $Y$ to take on different values for $x > 0$ and $x < 0$ we get a
more general class of processes called tempered stable processes.

Formally setting $Y = 0$ we get the Lévy density of the so-called Variance
Gamma process (VG for short) that has been proposed by Madan, Carr and Chang
[14]. The VG process is a time-changed Brownian motion where the time change is
a gamma process, which itself is a Lévy process, obtained by properly extending
the definition of the inverse of a Poisson process from natural numbers to positive
reals. Thus we can write

$$VG(t) = W_{\theta,\sigma}(\gamma_{\mu,\nu}(t)),$$

where $W_{\theta,\sigma}(t) = \theta t + \sigma W(t)$, with $W$ being the standard
Wiener process, and $\gamma_{\mu,\nu}$ is a gamma process with mean rate $\mu$ and
variance rate $\nu$, see [14]. For unit mean rate, $\mu = 1$, its characteristic
function is given by

$$\varphi_{VG(t)}(u) = \left( 1 - iu\theta\nu + u^2\sigma^2\nu/2 \right)^{-t/\nu}.$$



This can be obtained by a formal limiting procedure, taking into account the
characteristic exponent given by (15) and taking $Y \to 0$.
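
The time-change construction of the VG process can be checked by simulation. The sketch below assumes a gamma clock with unit mean rate and variance rate $\nu$, draws VG(t) as a Brownian motion with drift evaluated at the gamma time, and compares the empirical characteristic function of the sample with the closed form $(1 - iu\theta\nu + u^2\sigma^2\nu/2)^{-t/\nu}$. All names and parameter values are hypothetical.

```python
import cmath
import math
import random


def vg_char_function(u, t, theta, sigma, nu):
    """Characteristic function of VG(t) for a gamma clock with unit mean rate."""
    return (1.0 - 1j * u * theta * nu + 0.5 * sigma ** 2 * nu * u ** 2) ** (-t / nu)


def sample_vg(t, theta, sigma, nu, rng):
    """Draw VG(t) = theta*g + sigma*W(g), where g is the gamma time change."""
    g = rng.gammavariate(t / nu, nu)      # shape t/nu, scale nu: mean t, variance nu*t
    return theta * g + sigma * math.sqrt(g) * rng.gauss(0.0, 1.0)


def empirical_char_function(u, sample):
    """ECF: average of exp(i*u*x) over the sample."""
    return sum(cmath.exp(1j * u * x) for x in sample) / len(sample)


rng = random.Random(1)
t, theta, sigma, nu = 1.0, -0.1, 0.2, 0.3   # hypothetical parameter values
sample = [sample_vg(t, theta, sigma, nu, rng) for _ in range(20000)]
for u in (1.0, 3.0, 6.0):
    gap = abs(empirical_char_function(u, sample)
              - vg_char_function(u, t, theta, sigma, nu))
    print(u, gap)
```

The printed gaps are sampling errors of order $N^{-1/2}$; they illustrate why the characteristic function is a natural object to fit when the density is not available in closed form.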

The knowledge of the explicit form of the characteristic function is a common 
feature of distributions in finance. This is the case for tempered stable and related 
processes, see [5]. We will focus on the CGMY process. 
3 Discrete time Lévy systems

A discrete time finite dimensional Lévy system is defined as

$$\Delta y_n = A(\theta^*)\, \Delta Z_n, \qquad (16)$$

where $\Delta Z_n$ is the increment of a Lévy process $Z$ over an interval
$[(n-1)h, nh)$ with $E[\Delta Z_n] = 0$, a property to be removed later, and
$h > 0$ is a fixed sampling interval. The Lévy measure of $Z$ will be denoted by
$\nu(dx) = \nu(dx, \eta^*)$, where $\eta^*$ denotes an unknown parameter vector;
for example, for a CGMY process $\eta^* = (C, G, M, Y)$. The range of $\eta^*$ is
assumed to be known.
Condition 1 We assume that

$$\int_{|x|>1} |x|^q\, \nu(dx) < +\infty \qquad (17)$$

for all $1 \le q \le Q$ with some constant $Q$.

Note that Condition 1 holds with $Q = \infty$ in our benchmark examples. Let
$D_\rho^*$ and $D_\rho$ be compact domains such that
$\rho^* \in D_\rho^* \subset \mathrm{int}\, D_\rho$ and $D_\rho \subset G_\rho$.

Condition 2 $A(\theta)$ is assumed to be exponentially stable and exponentially
inverse stable for $\theta \in G_\theta \subset \mathbb{R}^p$, where $G_\theta$ is
a known open set.

In continuous time a system is exponentially stable if all the eigenvalues of its
state matrix $H$ have strictly negative real parts; in discrete time the
eigenvalues must lie strictly inside the unit circle. The application of the ML
method would solve the full identification problem along standard lines, assuming
that the density function of $\Delta Z_n$ is known, see [21], which is
unfortunately not the case. The objective of this paper is to present a
combination of advanced techniques in system identification with a specific
statistical technique, widely used in finance, called the ECF (shorthand for
empirical characteristic function) method. The ECF method was originally designed
for i.i.d. samples, and A. Feuerverger and P. McDunnough [13] showed that it can
be interpreted as the Fourier transform of an ML method.



4 Three identification problems 

In this section we formulate three identification problems related to
discrete-time, finite dimensional Lévy systems, and sketch a possible path to
their solution. The first, simplest problem is seemingly of mere technical
interest:

Known system parameters, unknown noise parameters. In this case define and
compute

$$\varepsilon_n(\theta^*) = A^{-1}(\theta^*)\, \Delta y_n = A^{-1}(\theta^*) A(\theta^*)\, \Delta Z_n = \Delta Z_n,$$

assuming, for the sake of simplicity, that $\Delta Z_n = \varepsilon_n(\theta^*) = 0$
for $n \le 0$. After that we can apply the ECF method for i.i.d. samples to obtain
an estimate of $\eta^*$. This simple solution will be the basis of the
identification method presented in Section 5.

Known noise parameters, unknown system parameters. This is the simplest
technically interesting and non-trivial problem. If we knew the probability
density function of the noise, say $f$, we could obtain the maximum likelihood
estimate of $\theta^*$ via solving

$$\sum_{n=1}^{N} \frac{\partial}{\partial\theta} \log f(\varepsilon_n(\theta), \eta^*) = 0, \qquad (18)$$

where

$$\varepsilon_n(\theta) = A^{-1}(\theta)\, \Delta y_n \qquad (19)$$

is the estimated innovation process of a SISO system, see [21].

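Under certain conditions the asymptotic covariance matrix of the ML estimate
$\hat\theta_N$ is

$$\Sigma_{ML} = \mu^{-1} (R^*)^{-1},$$

where

$$\mu = E\left[ \left( \frac{f'(\Delta Z_n, \eta^*)}{f(\Delta Z_n, \eta^*)} \right)^2 \right],$$

with $f'$ being the derivative of $f$ w.r.t. the first variable and

$$R^* = \lim_{n\to\infty} E\left[ \varepsilon_{n\theta}(\theta^*)\, \varepsilon_{n\theta}^T(\theta^*) \right].$$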

In our case, the p.d.f. of the noise distribution is not known. One might apply
the prediction error method to estimate the system dynamics, i.e. $\theta^*$.
However, we will show, in the case of CGMY noise, that we may estimate $\theta^*$
in a more efficient way using an appropriate adaptation of the ECF method. In
fact, this result is a special case of a more general result obtained for the
general problem described next.

Both the system parameters and the noise parameters are unknown. The first
method that we propose is quite straightforward: we estimate the system
parameters using a PE method, then, using a certainty equivalence argument, we
estimate the innovation process by inverting the system using the estimated
parameters. Then we estimate the noise parameters using the ECF method for i.i.d.
sequences. This method will be studied in Section 7.

The second method, which is the main subject of this paper, estimates both
the system parameters and the noise parameters using an ECF method. First, a
parameter-dependent estimated innovation process $\varepsilon_n(\theta)$ is
defined, then the characteristic function of the noise is fitted to empirical data
defined in terms of $\varepsilon_n(\theta)$. Thus we get a score function that
depends on both $\theta$ and $\eta$.

The third method applies an extension of the ECF method using blocks of the
time series of unprocessed data $(\Delta y_n)$. More details can be found in the
Discussion.



5 Single term ECF method 

The ECF method has been widely used in finance as an alternative to the ML
method, assuming i.i.d. returns [7], [8], [11]. We adapt this technique to the
problem of identifying the discrete-time Lévy system described in (12). Fix a
realization of $A$ in its innovation form, i.e. assume that $A$ and its inverse
are exponentially stable. The estimated innovation process
$(\varepsilon_n(\theta))$ is defined via the inverse filter

$$dX_t = (H - KL) X_t\, dt + K\, dY_t \qquad (20)$$
$$d\varepsilon_t = dY_t - L X_t\, dt, \qquad (21)$$

for continuous time models. For discrete time Lévy systems we define the
innovation process by

$$\varepsilon_n(\theta) = A^{-1}(\theta)\, \Delta y_n, \qquad (22)$$

with zero initial conditions and $\theta \in D_\rho$. Let $\varepsilon^*_n(\theta)$
denote the stationary solution of (22) when $-\infty < n < \infty$. In general,
the notation $(\cdot)^*$ will be used throughout this paper if the corresponding
stochastic process is obtained by passing a stationary process through an
exponentially stable linear filter starting at $-\infty$, as opposed to
initializing the filter at time $0$ with some arbitrary initial condition, which
is typically zero. Then we have for $n \ge 0$

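$$\varepsilon^*_n(\theta) = \varepsilon_n(\theta) + r_n, \qquad (23)$$

where $r_n = O_M^Q(\alpha^n)$ with some $0 < \alpha < 1$, meaning that for all
$1 \le q \le Q$

$$\sup_n \frac{E^{1/q} |r_n|^q}{\alpha^n} < \infty.$$

We will use this notation in a more general way:

Definition 1 For a stochastic process $X_n$ and a function
$f : \mathbb{Z} \to \mathbb{R}_+$ we say that

$$X_n = O_M^Q(f(n))$$

if for all $1 \le q \le Q$

$$\sup_n \frac{E^{1/q} |X_n|^q}{f(n)} < \infty$$

holds.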



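The score functions to be used following the basic idea of the ECF method are
defined as

$$h_n(u; \theta, \eta) = e^{iu\varepsilon_n(\theta)} - \varphi(u, \eta) \qquad (24)$$
$$h^*_n(u; \theta, \eta) = e^{iu\varepsilon^*_n(\theta)} - \varphi(u, \eta) \qquad (25)$$

with $u \in \mathbb{R}$, where $\varphi(u, \eta)$ denotes the characteristic
function of the noise $\Delta Z_n$. These are indeed appropriate score functions,
since we obviously have

$$E[h^*_n(u; \theta^*, \eta^*)] = 0,$$

and

$$h_n(u; \theta^*, \eta^*) = h^*_n(u; \theta^*, \eta^*) + O_M^Q(\alpha^n).$$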

While $h_n$ is the function that can be computed in practice, $h^*_n$ is easier to
handle because of its stationarity. Following the philosophy of the ECF method
take a fixed set of $u_i$-s, $i = 1, \ldots, k$, and define the $k$-dimensional
vector

$$h_n(\theta, \eta) = (h_n(u_1; \theta, \eta), \ldots, h_n(u_k; \theta, \eta))^T.$$

Let $K > 0$ be a fixed symmetric, positive definite $k \times k$ weighting matrix.
Since the system of equations

$$h_n(\theta, \eta) = 0, \qquad n = 1, \ldots, N$$

is overdetermined we seek a least-squares solution. Therefore we define the cost
functions as

$$V_N = V_N(\theta, \eta) = \sum_{n=1}^{N} |K^{-1/2} h_n(\theta, \eta)|^2$$
$$V^*_N = V^*_N(\theta, \eta) = \sum_{n=1}^{N} |K^{-1/2} h^*_n(\theta, \eta)|^2$$

and by solving

$$V_{N\theta}(\theta, \eta) = 0 \qquad (26)$$
$$V_{N\eta}(\theta, \eta) = 0 \qquad (27)$$

we obtain the estimates $\hat\theta_N$ and $\hat\eta_N$ of $\theta^*$ and
$\eta^*$, respectively.
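
To make the least-squares ECF cost concrete, here is a minimal sketch for the simplest setting: an i.i.d. innovation sample and one scalar noise parameter. It is not the paper's implementation. As a stand-in for CGMY noise it assumes Laplace noise (difference of two exponentials with mean $b$), whose characteristic function is $1/(1 + b^2u^2)$, takes $K = I$, and replaces the solution of the gradient equations by a plain grid search; all names and values are hypothetical.

```python
import cmath
import random


def laplace_char_function(u, b):
    """Characteristic function of E1 - E2 with E1, E2 i.i.d. exponential, mean b."""
    return 1.0 / (1.0 + (b * u) ** 2)


def ecf_cost(sample, us, phi, eta):
    """V_N(eta) = sum over n and j of |exp(i*u_j*eps_n) - phi(u_j, eta)|^2 (K = I)."""
    total = 0.0
    for u in us:
        target = phi(u, eta)
        total += sum(abs(cmath.exp(1j * u * e) - target) ** 2 for e in sample)
    return total


def ecf_estimate(sample, us, phi, grid):
    """Grid-search minimiser of the ECF cost; a stand-in for solving V_N' = 0."""
    return min(grid, key=lambda eta: ecf_cost(sample, us, phi, eta))


rng = random.Random(2)
b_true = 1.0
sample = [rng.expovariate(1.0 / b_true) - rng.expovariate(1.0 / b_true)
          for _ in range(2000)]
grid = [0.5 + 0.01 * i for i in range(101)]
b_hat = ecf_estimate(sample, (0.5, 1.0), laplace_char_function, grid)
print("ECF estimate of b:", b_hat)
```

For a fixed $u$ the cost differs from $N\,|\hat\varphi_N(u) - \varphi(u, \eta)|^2$, with $\hat\varphi_N$ the empirical characteristic function, only by a term that does not depend on $\eta$, so minimizing it amounts to matching the model characteristic function to the empirical one at the chosen points.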



6 Analysis 

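Differentiating $V^*_N$ w.r.t. $\theta$ and $\eta$ we get the equations

$$V^*_{N\theta}(\theta, \eta) = \sum_{n=1}^{N} \left( h^{*T}_{n\theta}(\theta, \eta) K^{-1} \bar h^*_n(\theta, \eta) + \bar h^{*T}_{n\theta}(\theta, \eta) K^{-1} h^*_n(\theta, \eta) \right) = 0, \qquad (28)$$
$$V^*_{N\eta}(\theta, \eta) = \sum_{n=1}^{N} \left( h^{*T}_{n\eta}(\theta, \eta) K^{-1} \bar h^*_n(\theta, \eta) + \bar h^{*T}_{n\eta}(\theta, \eta) K^{-1} h^*_n(\theta, \eta) \right) = 0, \qquad (29)$$

where $\bar h$ is the conjugate of $h$. Note that, setting $\theta = \theta^*$,
the second equation is just the optimality condition of the ECF method for i.i.d.
samples [8]. As for the first equation, the derivative of the score function $h$
with respect to $\theta$ is

$$h_{n\theta}(u, \theta, \eta) = e^{iu\varepsilon_n(\theta)}\, iu\, \varepsilon_{n\theta}(\theta). \qquad (30)$$

Hence in the first equation $h_{n\theta}(\theta, \eta)$ and $h_n(\theta, \eta)$
are not independent. However, the next lemma shows that their stationary
approximations, $h^*_{n\theta}(\theta, \eta)$ and $h^*_n(\theta, \eta)$, are
uncorrelated.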

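Lemma 1 For any $\eta$ we have $E[V^*_{N\theta}(\theta^*, \eta)] = 0$, and in
addition $E[V^*_{N\eta}(\theta^*, \eta^*)] = 0$.

Proof Consider the $n$th term in (28). We have

$$E\left[ h^{*T}_{n\theta}(\theta^*, \eta) K^{-1} \bar h^*_n(\theta^*, \eta) \right] = \sum_{l,m=1}^{k} K^{-1}_{l,m}\, E\left[ \left( e^{iu_l \varepsilon^*_n(\theta^*)}\, iu_l\, \varepsilon^*_{n\theta}(\theta^*) \right) \left( e^{-iu_m \varepsilon^*_n(\theta^*)} - \varphi(-u_m, \eta) \right) \right]. \qquad (31)$$

Compute the $(l, m)$ term using the tower law:

$$E\left[ E\left[ \left( e^{iu_l \varepsilon^*_n(\theta^*)}\, iu_l\, \varepsilon^*_{n\theta}(\theta^*) \right) \left( e^{-iu_m \varepsilon^*_n(\theta^*)} - \varphi(-u_m, \eta) \right) \,\middle|\, \mathcal{F}_{n-1} \right] \right], \qquad (32)$$

where $\mathcal{F}_{n-1} = \sigma\{\Delta Z_k : k \le n-1\}$. Here we used that
$\bar\varphi(u, \eta) = \varphi(-u, \eta)$. Due to the fact that
$\varepsilon^*_{n\theta}(\theta^*)$ is $\mathcal{F}_{n-1}$ measurable, while
$\varepsilon^*_n(\theta^*) = \Delta Z_n$ is independent of $\mathcal{F}_{n-1}$,
(32) can be written as

$$E\left[ iu_l\, \varepsilon^*_{n\theta}(\theta^*)\, E\left[ e^{i(u_l - u_m)\varepsilon^*_n(\theta^*)} - e^{iu_l \varepsilon^*_n(\theta^*)} \varphi(-u_m, \eta) \,\middle|\, \mathcal{F}_{n-1} \right] \right]$$
$$= E\left[ iu_l\, \varepsilon^*_{n\theta}(\theta^*) \left( \varphi(u_l - u_m, \eta^*) - \varphi(u_l, \eta^*) \varphi(-u_m, \eta) \right) \right]$$
$$= \left( \varphi(u_l - u_m, \eta^*) - \varphi(u_l, \eta^*) \varphi(-u_m, \eta) \right) E\left[ iu_l\, \varepsilon^*_{n\theta}(\theta^*) \right] = 0.$$

To reduce the last equation we used that $E[\Delta Z_n] = 0$.
Similarly, for the $n$th term of (29) we have

$$h^*_{n\eta}(u, \theta, \eta) = -\varphi_\eta(u, \eta),$$

which is non-random, implying that

$$E\left[ h^{*T}_{n\eta}(\theta^*, \eta^*) K^{-1} \bar h^*_n(\theta^*, \eta^*) \right] = 0. \qquad \square$$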

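The previous lemma also shows that the gradient of $V^*_N(\theta, \eta)$ serves as
an alternative score function. The following corollary is implied by the fact that

$$E\left[ h^T_{n\theta}(\theta^*, \eta) K^{-1} \bar h_n(\theta^*, \eta) \right] = E\left[ h^{*T}_{n\theta}(\theta^*, \eta) K^{-1} \bar h^*_n(\theta^*, \eta) \right] + O_M^Q(\alpha^n). \qquad (33)$$

Corollary 1 For any $\eta$ we have $E[V_{N\theta}(\theta^*, \eta)] = O_M^Q(\alpha^N)$,
and in addition $E[V_{N\eta}(\theta^*, \eta^*)] = O_M^Q(\alpha^N)$.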

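Define $\rho = (\theta, \eta)$, and define the asymptotic cost function by

$$W(\theta, \eta) = W(\rho) = E\, |K^{-1/2} h^*_n(\rho)|^2.$$

Condition 3 The equation $W_\rho(\rho) = 0$ has a unique solution in $D_\rho$.

A crucial object is the Hessian of $W$ at $\rho = \rho^*$:

$$R^* = W_{\rho\rho}(\rho^*).$$

It is easy to see that

$$R^* = \begin{pmatrix} W_{\theta\theta}(\theta^*) & 0 \\ 0 & W_{\eta\eta}(\eta^*) \end{pmatrix}$$

is a block diagonal matrix.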

The following result provides a precise characterization of the estimation error: 



Theorem 1 Under Conditions 1, 2 and 3 we have

$$\hat\rho_N - \rho^* = -(R^*)^{-1} \frac{1}{N} V_{N\rho}(\rho^*) + O_M^{Q/(2(p+q))}(N^{-1}).$$

First, we prove some lemmas that will be used in the proof of Theorem 1. For
the definition of L-mixing processes and for other corresponding definitions and
theorems see the Appendix.
Lemma 2 Under Conditions 1, 2, 3 the processes $\varepsilon_n(\theta)$,
$\varepsilon_{n\theta}(\theta)$ and $\varepsilon_{n\theta\theta}(\theta)$ are
L-mixing uniformly of order $Q$.

Proof First, note that since $\Delta y_n = \sum_{l=0}^{\infty} a_l(\theta^*)\, \Delta Z_{n-l}$
holds, $\Delta y_n$ is a linear combination of L-mixing processes of order $Q$.
Using the fact that a uniformly exponentially stable filter with L-mixing input
produces a uniformly L-mixing output [20], we get that $\Delta y_n$ is an L-mixing
process of order $Q$. The innovation process and its derivatives with respect to
$\theta$ can be written as

$$\varepsilon_n(\theta) = A^{-1}(\theta)\, \Delta y_n$$
$$\varepsilon_{n\theta}(\theta) = A^{-1}_\theta(\theta)\, \Delta y_n$$
$$\varepsilon_{n\theta\theta}(\theta) = A^{-1}_{\theta\theta}(\theta)\, \Delta y_n.$$

Again, since $A^{-1}(\theta)$ and its derivatives with respect to $\theta$ are
uniformly exponentially stable, the lemma follows. □

Lemma 3 Suppose that Conditions 1, 2, 3 hold. Then for any given $d > 0$ the
equation $V_{N\rho}(\rho) = 0$ has a unique solution in $D_\rho$ and it is in the
sphere $S = \{\rho : |\rho - \rho^*| < d\}$ with probability at least
$1 - O(N^{-s})$ for any $0 < s < Q/2$. Furthermore the constant $C$ in
$O(N^{-s}) = CN^{-s}$ depends only on $d$ and $s$.

Proof First, note that since $\varepsilon_n$, $\varepsilon_{n\theta}$ and
$\varepsilon_{n\theta\theta}$ are L-mixing processes uniformly of order $Q$, the
processes $h_n$, $h_{n\rho}$ and $h_{n\rho\rho}$ are L-mixing uniformly of order
$Q/2$ as well. It follows that the process

$$u_n(\rho) = \frac{\partial}{\partial\rho} \left( \bar h_n^T(\rho) K^{-1} h_n(\rho) \right) - W_\rho(\rho) \qquad (34)$$

and its derivative with respect to $\rho$ are L-mixing uniformly of order $Q/2$.
$E[u^*_n(\rho)] = 0$ implies $E[u_n(\rho)] = O(\alpha^n)$ uniformly in $\rho$, and
hence, following Theorem 9, we have for

$$\delta V_{N\rho} = \sup_{\rho \in D_\rho} \left| \frac{1}{N} V_{N\rho}(\rho) - W_\rho(\rho) \right|, \qquad (35)$$
$$\delta V_{N\rho\rho} = \sup_{\rho \in D_\rho} \left\| \frac{1}{N} V_{N\rho\rho}(\rho) - W_{\rho\rho}(\rho) \right\| \qquad (36)$$

that $\delta V_{N\rho} = O_M^{Q/(2(p+q))}(N^{-1/2})$ and
$\delta V_{N\rho\rho} = O_M^{Q/(2(p+q))}(N^{-1/2})$. Thus,

$$P(\delta V_{N\rho} > d) = O(N^{-s})$$

with any $0 < s < Q/(4p + 4q)$ by Markov's inequality. Applying the same argument
yields

$$P(\delta V_{N\rho\rho} > d'') = O(N^{-s})$$

for any $d'' > 0$ and any $0 < s < Q/(4p + 4q)$.

Suppose now that the equation $V_{N\rho}(\rho) = 0$ has a solution outside $S$.
Define

$$d' = \inf\{ |W_\rho(\rho)| : \rho \in D_\rho,\ |\rho - \rho^*| \ge d \} > 0,$$

which is positive since $W_\rho$ is continuous and $D_\rho$ is compact. It follows
that $\delta V_{N\rho} > d'$, and we have seen that this event has probability
$O(N^{-s})$. So for $\Omega_N = \{\delta V_{N\rho} < d,\ \delta V_{N\rho\rho} < d'\}$
we have

$$P(\Omega_N) > 1 - O(N^{-s})$$

with any $0 < s < Q/2$. The equation $W_\rho(\rho) = 0$ has a unique solution in
$D_\rho$. Hence, by using the implicit function theorem, see Theorem 10, one can
easily conclude that $V_{N\rho}(\rho) = 0$ has a unique solution if $d'$ and $d$
are sufficiently small. □

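Lemma 4 Under Conditions 1, 2, 3 we have $\hat\rho_N - \rho^* = O_M^{Q/2}(N^{-1/2})$.

Proof We have

$$0 = V_{N\rho}(\hat\rho_N) = V_{N\rho}(\rho^*) + \bar V_{N\rho\rho}\, (\hat\rho_N - \rho^*), \qquad (37)$$

where

$$\bar V_{N\rho\rho} = \int_0^1 V_{N\rho\rho}((1 - \lambda)\rho^* + \lambda\hat\rho_N)\, d\lambda.$$

Since

$$\varepsilon_n(\theta^*) = \Delta Z_n + O_M^Q(\alpha^n)$$

for some $|\alpha| < 1$, and with $u_n = |K^{-1/2} h_n(\rho)|$, using the
inequality in Theorem 7 with $f_n = 1$ and $q \le Q$, from

$$E^{1/q} \left| \sum_{n=1}^{N} u_n \right|^q \le C_q N^{1/2} M_q^{1/2}(u)\, \Gamma_q^{1/2}(u) \qquad (38)$$

we conclude $V_{N\rho}(\rho^*) = O_M^{Q/2}(N^{1/2})$. Let

$$\bar W_{N\rho\rho} = \int_0^1 W_{\rho\rho}((1 - \lambda)\rho^* + \lambda\hat\rho_N)\, d\lambda.$$

$W$ is a smooth function, hence

$$\| W_{\rho\rho}(\rho^* + \lambda(\hat\rho_N - \rho^*)) - W_{\rho\rho}(\rho^*) \| \le c\, |\hat\rho_N - \rho^*| < cd. \qquad (39)$$

Clearly $W_{\rho\rho}(\rho^*)$ is positive definite, hence
$\bar W_{N\rho\rho} > cI$ with some $c > 0$. Since on $\Omega_N$

$$\left\| \frac{1}{N} \bar V_{N\rho\rho} - \bar W_{N\rho\rho} \right\| < d'$$

holds, choosing $d'$ sufficiently small yields

$$\lambda_{\min}\left( \frac{1}{N} \bar V_{N\rho\rho} \right) > c \qquad (40)$$

on $\Omega_N$, where $\lambda_{\min}(M)$ denotes the smallest eigenvalue of $M$.
Thus $\| \bar V^{-1}_{N\rho\rho} \| \le cN^{-1}$ on $\Omega_N$. Then using (37) we
get that

$$\chi_{\Omega_N} (\hat\rho_N - \rho^*) = O_M^{Q/2}(N^{-1/2}).$$

Furthermore, since $P(\Omega_N^c) = O(N^{-s})$ for any $0 < s < Q/2$, the lemma
follows. □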
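Now we are ready to prove Theorem 1.

Proof Using the previous lemma one can improve (39):

$$\| W_{\rho\rho}(\rho^* + \lambda(\hat\rho_N - \rho^*)) - W_{\rho\rho}(\rho^*) \| \le c\, |\hat\rho_N - \rho^*| = O_M^{Q/2}(N^{-1/2}),$$

and after integration with respect to $\lambda$ we get

$$\| \bar W_{N\rho\rho} - W_{\rho\rho}(\rho^*) \| = O_M^{Q/2}(N^{-1/2}). \qquad (41)$$

Since $\delta V_{N\rho\rho} = O_M^{Q/(2(p+q))}(N^{-1/2})$, it implies

$$\left\| \frac{1}{N} \bar V_{N\rho\rho} - \bar W_{N\rho\rho} \right\| = O_M^{Q/(2(p+q))}(N^{-1/2}). \qquad (42)$$

Hence by the triangle inequality, from (41) and (42),

$$\left\| \frac{1}{N} \bar V_{N\rho\rho} - W_{\rho\rho}(\rho^*) \right\| = O_M^{Q/(2(p+q))}(N^{-1/2}) \qquad (43)$$

follows. From (40) and (43) we get

$$\chi_{\Omega_N} \left\| \bar V^{-1}_{N\rho\rho} - \frac{1}{N} W^{-1}_{\rho\rho}(\rho^*) \right\| = O_M^{Q/(2(p+q))}(N^{-3/2}). \qquad (44)$$

Finally,

$$\chi_{\Omega_N} (\hat\rho_N - \rho^*) = -\chi_{\Omega_N} \bar V^{-1}_{N\rho\rho} V_{N\rho}(\rho^*)$$
$$= -\chi_{\Omega_N} \left( \frac{1}{N} W^{-1}_{\rho\rho}(\rho^*) + O_M^{Q/(2(p+q))}(N^{-3/2}) \right) V_{N\rho}(\rho^*)$$
$$= -\chi_{\Omega_N} \frac{1}{N} W^{-1}_{\rho\rho}(\rho^*) V_{N\rho}(\rho^*) + O_M^{Q/(2(p+q))}(N^{-1}).$$

Since $\chi_{\Omega_N} = 1 - O_M^Q(N^{-s})$ for any $0 < s < Q/(4p + 4q)$, the
last expression reads as

$$-(R^*)^{-1} \frac{1}{N} V_{N\rho}(\rho^*) + O_M^{Q/(2(p+q))}(N^{-1}).$$

□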



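The following theorem provides an explicit expression for the Hessian of $W$:

Theorem 2 Under Conditions 1, 2, 3 we have

$$R^* = \begin{pmatrix} W_{\theta\theta}(\theta^*) & 0 \\ 0 & W_{\eta\eta}(\eta^*) \end{pmatrix},$$

i.e. $R^*$ is block diagonal, and here

$$W_{\theta\theta}(\theta^*) = w\, E[\varepsilon^*_{n\theta}(\theta^*)\, \varepsilon^{*T}_{n\theta}(\theta^*)]$$

with

$$w = \sum_{l,m=1}^{k} K^{-1}_{l,m} \left( (u_l^2 + u_m^2)\, \varphi(u_l, \eta^*) \varphi(-u_m, \eta^*) - (u_l - u_m)^2 \varphi(u_l - u_m, \eta^*) \right),$$

and

$$(W_{\eta\eta})_{j,j'}(\eta^*) = \sum_{l,m=1}^{k} K^{-1}_{l,m} \left( \varphi_{\eta_j}(u_l, \eta^*) \varphi_{\eta_{j'}}(-u_m, \eta^*) + \varphi_{\eta_{j'}}(u_l, \eta^*) \varphi_{\eta_j}(-u_m, \eta^*) \right).$$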
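Proof First let $j, j' \le \dim\theta = p$; then an entry $(R^*)_{j,j'}$ of $R^*$ is

$$\frac{\partial^2}{\partial\theta_j \partial\theta_{j'}} \sum_{l,m=1}^{k} K^{-1}_{l,m}\, E\left[ \left( e^{iu_l\varepsilon^*_n(\theta)} - \varphi(u_l, \eta^*) \right) \left( e^{-iu_m\varepsilon^*_n(\theta)} - \varphi(-u_m, \eta^*) \right) \right] \Big|_{\theta=\theta^*}.$$

Carrying out the differentiation yields, for the $(l, m)$ term,

$$E\left[ \left( e^{iu_l\varepsilon^*_n} (iu_l)^2 \varepsilon^*_{n\theta_j} \varepsilon^*_{n\theta_{j'}} + e^{iu_l\varepsilon^*_n}\, iu_l\, \varepsilon^*_{n\theta_j\theta_{j'}} \right) \left( e^{-iu_m\varepsilon^*_n} - \varphi(-u_m, \eta^*) \right) \right]$$
$$+\, E\left[ 2\, e^{iu_l\varepsilon^*_n}\, iu_l\, e^{-iu_m\varepsilon^*_n} (-iu_m)\, \varepsilon^*_{n\theta_j} \varepsilon^*_{n\theta_{j'}} \right]$$
$$+\, E\left[ \left( e^{iu_l\varepsilon^*_n} - \varphi(u_l, \eta^*) \right) \left( e^{-iu_m\varepsilon^*_n} (-iu_m)^2 \varepsilon^*_{n\theta_j} \varepsilon^*_{n\theta_{j'}} + e^{-iu_m\varepsilon^*_n} (-iu_m)\, \varepsilon^*_{n\theta_j\theta_{j'}} \right) \right],$$

where all processes are evaluated at $\theta = \theta^*$. Now we use, as in the
proof of Lemma 1, the tower rule, the fact that $\varepsilon^*_{n\theta}(\theta^*)$
is $\mathcal{F}_{n-1}$ measurable, and that
$E[\varepsilon^*_{n\theta_j}(\theta^*)] = E[\varepsilon^*_{n\theta_{j'}}(\theta^*)] = E[\varepsilon^*_{n\theta_j\theta_{j'}}(\theta^*)] = 0$.
Summing over $l, m$ with the weights $K^{-1}_{l,m}$, the previous formula reads as

$$E[\varepsilon^*_{n\theta_j}(\theta^*) \varepsilon^*_{n\theta_{j'}}(\theta^*)] \sum_{l,m=1}^{k} K^{-1}_{l,m} \left( \varphi(u_l - u_m, \eta^*) - \varphi(u_l, \eta^*) \varphi(-u_m, \eta^*) \right) (-u_l^2)$$
$$+\, 2 E[\varepsilon^*_{n\theta_j}(\theta^*) \varepsilon^*_{n\theta_{j'}}(\theta^*)] \sum_{l,m=1}^{k} K^{-1}_{l,m}\, \varphi(u_l - u_m, \eta^*)\, (u_l u_m)$$
$$+\, E[\varepsilon^*_{n\theta_j}(\theta^*) \varepsilon^*_{n\theta_{j'}}(\theta^*)] \sum_{l,m=1}^{k} K^{-1}_{l,m} \left( \varphi(u_l - u_m, \eta^*) - \varphi(u_l, \eta^*) \varphi(-u_m, \eta^*) \right) (-u_m^2)$$
$$= E[\varepsilon^*_{n\theta_j}(\theta^*) \varepsilon^*_{n\theta_{j'}}(\theta^*)] \times \sum_{l,m=1}^{k} K^{-1}_{l,m} \left( (u_l^2 + u_m^2)\, \varphi(u_l, \eta^*) \varphi(-u_m, \eta^*) - (u_l - u_m)^2 \varphi(u_l - u_m, \eta^*) \right).$$

To double check the result note that the last formula gives a real matrix, since
conjugation does not modify the value of the double sum.

If $j \le p < j' \le p + q$, then $(R^*)_{j,j'}$ equals

$$\frac{\partial^2}{\partial\theta_j \partial\eta_{j'}} E\, |K^{-1/2} h^*_n(\theta, \eta)|^2 \Big|_{\theta=\theta^*,\, \eta=\eta^*} = 0,$$

because the differentiation with respect to $\eta_{j'}$ yields a non-random
constant of the form $\varphi_{\eta_{j'}}(u, \eta^*)$ and the differentiation with
respect to $\theta_j$ yields the term

$$E\left[ e^{iu\varepsilon^*_n(\theta^*)}\, iu\, \varepsilon^*_{n\theta_j}(\theta^*) \right] = 0.$$

Finally, if $p < j, j'$, then $(R^*)_{j,j'}$ equals

$$\frac{\partial^2}{\partial\eta_j \partial\eta_{j'}} E\, |K^{-1/2} h^*_n(\theta^*, \eta)|^2 \Big|_{\eta=\eta^*} = \sum_{l,m=1}^{k} K^{-1}_{l,m} \left( \varphi_{\eta_j}(u_l, \eta^*) \varphi_{\eta_{j'}}(-u_m, \eta^*) + \varphi_{\eta_{j'}}(u_l, \eta^*) \varphi_{\eta_j}(-u_m, \eta^*) \right).$$

To sum it up,

$$R^* = \begin{pmatrix} W_{\theta\theta}(\theta^*) & 0 \\ 0 & W_{\eta\eta}(\eta^*) \end{pmatrix}$$

is a block diagonal matrix, where

$$W_{\theta\theta}(\theta^*) = E[\varepsilon^*_{n\theta}(\theta^*)\, \varepsilon^{*T}_{n\theta}(\theta^*)] \times \sum_{l,m=1}^{k} K^{-1}_{l,m} \left( (u_l^2 + u_m^2)\, \varphi(u_l, \eta^*) \varphi(-u_m, \eta^*) - (u_l - u_m)^2 \varphi(u_l - u_m, \eta^*) \right),$$

and

$$(W_{\eta\eta})_{j,j'}(\eta^*) = \sum_{l,m=1}^{k} K^{-1}_{l,m} \left( \varphi_{\eta_j}(u_l, \eta^*) \varphi_{\eta_{j'}}(-u_m, \eta^*) + \varphi_{\eta_{j'}}(u_l, \eta^*) \varphi_{\eta_j}(-u_m, \eta^*) \right).$$

□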

Remark 1: Note that the expression for $(W_{\eta\eta})_{j,j'}(\eta^*)$ is identical
to what we would obtain for i.i.d. samples following [7].

Remark 2: Since we have $w > 0$, the expression for $w$ yields a non-trivial
inequality for characteristic functions.
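
As a quick sanity check of the expression for $w$, assuming it has the form $w = \sum_{l,m} K^{-1}_{l,m} \left( (u_l^2 + u_m^2)\varphi(u_l, \eta^*)\varphi(-u_m, \eta^*) - (u_l - u_m)^2 \varphi(u_l - u_m, \eta^*) \right)$, consider the single-term case $k = 1$ with $K = 1$ and $u_1 = u$ (a specialization added here for illustration). The double sum collapses to one term:

```latex
w = (u^2 + u^2)\,\varphi(u,\eta^*)\,\varphi(-u,\eta^*) - (u - u)^2\,\varphi(0,\eta^*)
  = 2u^2\,\varphi(u,\eta^*)\,\overline{\varphi(u,\eta^*)}
  = 2u^2\,|\varphi(u,\eta^*)|^2 .
```

In this case $w > 0$ for every $u \neq 0$, because the characteristic function of an infinitely divisible distribution, such as that of a Lévy increment, has no real zeros.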

The next step in calculating the asymptotic covariance matrix of $\hat\theta_N$ is
the computation of $S^* = \mathrm{Cov}(V^*_{N\rho}(\rho^*), V^*_{N\rho}(\rho^*))$.
For this we need to introduce the following auxiliary function:

$$F(a, b, c, d, \eta) = ab \left[ \varphi(a+b+c+d, \eta) - \varphi(a+b+c, \eta)\varphi(d, \eta) - \varphi(a+b+d, \eta)\varphi(c, \eta) + \varphi(a+b, \eta)\varphi(c, \eta)\varphi(d, \eta) \right].$$


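Theorem 3 Under Conditions 1, 2, 3 we have

$$S^* = \mathrm{Cov}(V^*_{N\rho}(\rho^*), V^*_{N\rho}(\rho^*)) = \begin{pmatrix} \mathrm{Cov}(V^*_{N\theta}(\rho^*), V^*_{N\theta}(\rho^*)) & 0 \\ 0 & \mathrm{Cov}(V^*_{N\eta}(\rho^*), V^*_{N\eta}(\rho^*)) \end{pmatrix},$$

where $\mathrm{Cov}(V^*_{N\theta}(\rho^*), V^*_{N\theta}(\rho^*)) = s\, E[\varepsilon^*_{n\theta}(\theta^*)\, \varepsilon^{*T}_{n\theta}(\theta^*)]$, with

$$s = -\sum_{l,m,s,t=1}^{k} K^{-1}_{l,m} K^{-1}_{s,t} \left[ F(u_l, u_s, -u_m, -u_t, \eta^*) + F(u_l, -u_t, -u_m, u_s, \eta^*) + F(-u_m, u_s, u_l, -u_t, \eta^*) + F(-u_m, -u_t, u_l, u_s, \eta^*) \right],$$

and the $(j, j')$ entry of $\mathrm{Cov}(V^*_{N\eta}(\rho^*), V^*_{N\eta}(\rho^*))$ is

$$\sum_{l,m,s,t=1}^{k} \left( \varphi_{\eta_j}(u_l, \eta^*) \varphi_{\eta_{j'}}(u_s, \eta^*) \varphi(-u_m - u_t, \eta^*) + \varphi_{\eta_j}(u_l, \eta^*) \varphi_{\eta_{j'}}(-u_t, \eta^*) \varphi(-u_m + u_s, \eta^*) + \cdots \right).$$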

The proof of the last theorem is a simple calculation like the previous one 
and the proof uses that E [ S * n0i (e*)e* m6i (e*)] = E [e [e* n6i (e*)e* m6i (0*) |^] ] = 

E [4 9i (<n [E [4« ( r)l^]]] = for m > n. 

The proof follows the line of arguments for Lemma 1. We note that calculations
are considerably simplified if we take $K = I$. Note that the $\theta$-blocks of
both $R^*$ and $S^*$ are of the form $c\, \Sigma_P^{-1}$, where $\Sigma_P$ is the
asymptotic covariance matrix for the prediction error method, see (46) below, and
$c$ is a constant. The last two theorems and Theorem 1 together give an exact
formula for the asymptotic covariance matrix of the estimator.

Theorem 4 Under Conditions 1, 2 and 3 the asymptotic covariance matrix of the ECF
estimator for $\theta^*$ can be written as

$$\Sigma_E = (R^*)^{-1} S^* (R^*)^{-1} = \frac{s}{w^2}\, \Sigma_P, \qquad (45)$$

where $s$ and $w$ are given in Theorems 3 and 2, respectively.



7 Combining PE and ECF estimators 

In this section we estimate the dynamics in a natural way and then we estimate
the noise parameters using the ECF method. We identify $\theta^*$ using only the
orthogonality of $\Delta Z$ by applying a prediction error method. This way we get
an estimate $\hat\theta_N$ of $\theta^*$ without using the characteristic function
of $\Delta Z$. Then we apply an ECF method with the score function

$$h_n(u, \eta) = e^{iu\varepsilon_n(\hat\theta_N)} - \varphi(u, \eta)$$

to estimate $\eta^*$.

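First, we define the estimated innovation process as in the previous sections.
The prediction error method is obtained by minimizing the cost function

$$V_{P,N}(\theta) = \frac{1}{2} \sum_{n=1}^{N} \varepsilon_n^2(\theta).$$

In practice the estimate $\hat\theta_N$ is defined as the solution of

$$V_{P,N\theta}(\theta) = \sum_{n=1}^{N} \varepsilon_n(\theta)\, \varepsilon_{n\theta}(\theta) = 0.$$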
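
The prediction error method can be illustrated on the simplest non-trivial system, a first-order moving average $\Delta y_n = \Delta Z_n + \theta^* \Delta Z_{n-1}$, whose inverse filter is $\varepsilon_n(\theta) = \Delta y_n - \theta\, \varepsilon_{n-1}(\theta)$. This example is an assumption made for illustration and is not taken from the paper; the noise is again a difference of two exponentials, and the minimization of the quadratic cost is done by a plain grid search instead of solving the gradient equation.

```python
import random


def innovations(theta, dy):
    """Inverse filter of dy[n] = dz[n] + theta*dz[n-1] with zero initial condition."""
    eps, previous = [], 0.0
    for y in dy:
        previous = y - theta * previous
        eps.append(previous)
    return eps


def pe_cost(theta, dy):
    """Prediction error cost V_{P,N}(theta) = (1/2) * sum of eps_n(theta)^2."""
    return 0.5 * sum(e * e for e in innovations(theta, dy))


rng = random.Random(3)
theta_true = 0.5                                   # hypothetical true parameter
dz = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(2001)]
dy = [dz[n] + theta_true * dz[n - 1] for n in range(1, len(dz))]
grid = [-0.9 + 0.01 * i for i in range(181)]       # keeps the inverse filter stable
theta_hat = min(grid, key=lambda th: pe_cost(th, dy))
print("PE estimate of theta:", theta_hat)
```

Only second-order properties of the noise enter this estimator, which is why it does not need the characteristic function of $\Delta Z$; the residuals $\varepsilon_n(\hat\theta_N)$ can then be passed to an ECF step for the noise parameters.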

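The asymptotic cost function associated with the PE method is defined as

$$W_P(\theta) = \frac{1}{2} \lim_{n\to\infty} E\, \varepsilon_n^2(\theta) = \frac{1}{2} E\, \varepsilon_n^{*2}(\theta);$$

recall that $\varepsilon^*_n(\theta)$ is the innovation process that is calculated
with stationary initial values. We have

$$W_{P,\theta}(\theta^*) = 0 \quad \text{and} \quad R^*_P := W_{P,\theta\theta}(\theta^*) = E[\varepsilon^*_{n\theta}(\theta^*)\, \varepsilon^{*T}_{n\theta}(\theta^*)].$$

The asymptotic covariance matrix of the PE estimate of $\theta^*$ is given by

$$\Sigma_P = \left( E[\varepsilon^*_{n\theta}(\theta^*)\, \varepsilon^{*T}_{n\theta}(\theta^*)] \right)^{-1}. \qquad (46)$$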
An ideal score function for the ECF method to estimate $\eta^*$ would be defined by

$$h_{opt,n}(u, \eta) = e^{iu\varepsilon_n(\theta^*)} - \varphi(u, \eta). \qquad (47)$$

Since we are not given $\theta^*$ we define an alternative, $\theta$-dependent
score function via

$$h_n(u, \theta, \eta) = e^{iu\varepsilon_n(\theta)} - \varphi(u, \eta).$$

These are appropriate score functions since $E[h^*_n(u, \theta^*, \eta^*)] = 0$.

Fix a set of real numbers $u_1, \ldots, u_k$, with $Nk > \dim\eta$, and define

$$h_n(\theta, \eta) = (h_n(u_1, \theta, \eta), \ldots, h_n(u_k, \theta, \eta))^T.$$

Then we obtain the estimate $\hat\eta_N$ of $\eta^*$ by finding a least squares
solution to the over-determined system of equations

$$h_n(\hat\theta_N, \eta) = 0, \qquad n = 1, \ldots, N.$$

More precisely, define the $\theta$-dependent cost function

$$V_{E,N}(\theta, \eta) = \sum_{n=1}^{N} |K^{-1/2} h_n(\theta, \eta)|^2,$$

where $K$ is a symmetric, positive definite weighting matrix. Then we obtain the
estimate $\hat\eta_N$ of $\eta^*$ by minimizing $V_{E,N}(\hat\theta_N, \eta)$.

Define the ($\theta$-dependent) asymptotic cost function as

$$W_E(\theta, \eta) = E\, |K^{-1/2} h^*_n(\theta, \eta)|^2.$$

Let its Hessian w.r.t. $\eta$ at $\eta = \eta^*$ be denoted by

$$R^*_E = W_{E,\eta\eta}(\theta^*, \eta^*).$$

To formulate our result we need some technical conditions. Conditions 1 and
2 have already been presented in Section 3. Let $\rho$ be the joint parameter,
i.e. $\rho = (\theta, \eta)$. Let $D_\rho^*$ and $D_\rho$ be compact domains such
that $\rho^* \in D_\rho^* \subset \mathrm{int}\, D_\rho$ and $D_\rho \subset G_\rho$.

Condition 3' The equations $W_{P\theta}(\theta) = 0$ and
$W_{E\eta}(\theta^*, \eta) = 0$ have a unique solution in $D_\rho$.

The following lemma, with minor variation, can be found in [19] , 
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Lemma 5 Under Conditions 1,2,3' we have 9 N - 9* = (N^ 1 / 2 ). 

Our next result characterizes the estimation error of the ECF method for the noise 
parameter rf . 

Theorem 5 Under Conditions 1,2 and 3' we have 

fj N -r,* = -(i&r^VW,,*) + Q J^ + ^(N- 1 ). 

The proof is obtained by the very same methods as that of Theorem [1] combined with
the fact that

\[ \left\|W_{E,\eta\eta}(\theta^*,\eta^*) - W_{E,\eta\eta}(\hat\theta_N,\eta^*)\right\| = O_M^{Q/2}(N^{-1/2}), \tag{48} \]

which is implied by $\hat\theta_N - \theta^* = O_M^{Q/2}(N^{-1/2})$. Equation (gHJ) and equations (gTJ),
(|42|) together imply (gg} .

8 Efficiency of the single term ECF method 

In view of the efficiency of the ECF method for i.i.d. samples the question arises 
what can be achieved by the proposed adaptation of the ECF method when iden- 
tifying the dynamics of a linear stochastic system. We do not have an answer to 
this general question, but we will show that the commonly used PE method can 
be outperformed by an appropriately calibrated ECF method when the noise is 
CGMY. Without loss of generality we may assume that 

\[ \operatorname{Var}(\Delta Z_n) = 1. \]

Surprisingly, we will see that the ECF method may outperform the PE method by
using a single $u$ sufficiently close to 0. Letting $u$ tend to 0, the asymptotic covariance
of the ECF estimate tends to the asymptotic covariance of the PE estimate. On
the other hand, numerical investigations show that increasing the number of $u$-s
used in the ECF method may not improve the efficiency significantly.

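For $k = 1$ the asymptotic covariance of $\hat\theta_N$ obtained by the ECF method is
$\lim_{N\to\infty} N\,\mathrm{Cov}(\hat\theta_N - \theta^*)$, which reads as, using Theorems [3] and [2],

\[ \left(E\left[\varepsilon_{n\theta}^*(\theta^*)\,\varepsilon_{n\theta}^{*T}(\theta^*)\right]\right)^{-1} \left(-\frac{1}{4u^2}\left(\frac{\varphi(2u)}{\varphi^2(u)} + \frac{\varphi(-2u)}{\varphi^2(-u)} - \frac{2}{\varphi(u)\varphi(-u)}\right)\right). \]

Recall that the asymptotic covariance of $\hat\theta_N$ obtained by the PE method is
$\Sigma_P = \left(E\left[\varepsilon_{n\theta}^*(\theta^*)\,\varepsilon_{n\theta}^{*T}(\theta^*)\right]\right)^{-1}$.

Thus the ECF estimator outperforms the PE estimator if

\[ \frac{s}{w^2} = -\frac{1}{4u^2}\left(\frac{\varphi(2u)}{\varphi^2(u)} + \frac{\varphi(-2u)}{\varphi^2(-u)} - \frac{2}{\varphi(u)\varphi(-u)}\right) < 1. \]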

Theorem 6 For all $u \neq 0$ sufficiently close to 0 we have $\frac{s}{w^2} < 1$, and thus the
corresponding single-term ECF estimator of the system parameter $\theta^*$, with $k = 1$,
outperforms the PE estimator.
Proof First note that for

\[ g(u) = \frac{\varphi(2u)}{\varphi^2(u)} + \frac{\varphi(-2u)}{\varphi^2(-u)} - \frac{2}{\varphi(u)\varphi(-u)} \]

$\overline{g(u)} = g(u)$ holds, so $g$ is a real-valued function. Let us compute the Taylor
expansion of $g$ around 0. The value of $\varphi$ and its first three derivatives at 0 for a
CGMY process with zero expectation are given by

\[ \varphi(0) = 1, \]
\[ \varphi_u(0) = iE[\Delta Z_n] = 0, \]
\[ \varphi_{uu}(0) = -E\left[(\Delta Z_n)^2\right] = -C\Gamma(2 - Y)\left(M^{Y-2} + G^{Y-2}\right) = -1, \]
\[ \varphi_{uuu}(0) = -iE\left[(\Delta Z_n)^3\right] = 0. \]

After a lengthy computation, which we omit, we get that

\[ g(u) = -4u^2 + \frac{4}{3} G^{-2}(Y - 2)(Y - 3)u^4 + O(u^6). \tag{50} \]

Thus

\[ \frac{s}{w^2} = -\frac{1}{4u^2}\left(\frac{\varphi(2u)}{\varphi^2(u)} + \frac{\varphi(-2u)}{\varphi^2(-u)} - \frac{2}{\varphi(u)\varphi(-u)}\right) = 1 - \frac{1}{3} G^{-2}(Y - 2)(Y - 3)u^2 + O(u^4). \]

Since $G < \infty$ and $0 < Y < 2$, the coefficient of $u^2$ is negative. Hence, by choosing $u$
sufficiently small, $\frac{s}{w^2} < 1$ can be achieved. □
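
The computation omitted in the proof can be sketched through cumulants. What follows is a reconstruction, not the authors' own derivation, under the assumptions $G = M$ (so that $\varphi$ is even and all odd cumulants vanish) and $\operatorname{Var}(\Delta Z_n) = 1$. The symbols $\psi$ and $\kappa_4$ (the fourth cumulant of $\Delta Z_n$) are introduced here for illustration only.

```latex
\begin{align*}
\psi(u) &:= \log\varphi(u) = -\tfrac{1}{2}u^2 + \tfrac{\kappa_4}{24}u^4 + O(u^6),\\
\frac{\varphi(2u)}{\varphi^2(u)} &= \exp\{\psi(2u) - 2\psi(u)\}
   = 1 - u^2 + \Bigl(\tfrac{1}{2} + \tfrac{7}{12}\kappa_4\Bigr)u^4 + O(u^6),\\
\frac{1}{\varphi(u)\varphi(-u)} &= \exp\{-2\psi(u)\}
   = 1 + u^2 + \Bigl(\tfrac{1}{2} - \tfrac{1}{12}\kappa_4\Bigr)u^4 + O(u^6),\\
g(u) &= 2\,\frac{\varphi(2u)}{\varphi^2(u)} - \frac{2}{\varphi(u)\varphi(-u)}
   = -4u^2 + \tfrac{4}{3}\kappa_4 u^4 + O(u^6).
\end{align*}
```

For the CGMY law, $\kappa_4 = C\Gamma(4 - Y)(M^{Y-4} + G^{Y-4})$. Using $\Gamma(4 - Y) = (3 - Y)(2 - Y)\Gamma(2 - Y)$ and the normalization $C\Gamma(2 - Y)(M^{Y-2} + G^{Y-2}) = 1$ with $G = M$, this gives $\kappa_4 = (2 - Y)(3 - Y)G^{-2}$, and dividing $g(u)$ by $-4u^2$ yields $s/w^2 = 1 - \tfrac{1}{3}G^{-2}(Y - 2)(Y - 3)u^2 + O(u^4)$.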

Numerical investigations show that for a CGMY process with parameters
$C = 0.564$, $G = M = 1$, $Y = 0.5$ the minimal value of $s/w^2$ is approximately 0.73. We
found that increasing the number of $u$-s that are used does not reduce $s/w^2$
significantly. For example, choosing $(u_1, \ldots, u_k) = (0.1, 0.2, \ldots, 0.1k)$ and $K = I$
we get $s/w^2 = 0.688$.
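
The single-term ratio $s/w^2$ can be evaluated directly from the characteristic function. The sketch below is an illustration under stated assumptions rather than the authors' code: it uses the standard CGMY characteristic function $\varphi(u) = \exp\{C\Gamma(-Y)[(M - iu)^Y - M^Y + (G + iu)^Y - G^Y]\}$, which this section does not restate, and the function names `cgmy_cf` and `efficiency_ratio` are hypothetical.

```python
import numpy as np
from math import gamma

def cgmy_cf(u, C, G, M, Y):
    """Standard CGMY characteristic function at unit time (assumed form)."""
    u = np.asarray(u, dtype=complex)
    exponent = C * gamma(-Y) * ((M - 1j * u) ** Y - M ** Y
                                + (G + 1j * u) ** Y - G ** Y)
    return np.exp(exponent)

def efficiency_ratio(u, C, G, M, Y):
    """s/w^2 = -(1/(4u^2)) * [phi(2u)/phi(u)^2 + phi(-2u)/phi(-u)^2 - 2/(phi(u)phi(-u))]."""
    phi = lambda v: cgmy_cf(v, C, G, M, Y)
    bracket = (phi(2 * u) / phi(u) ** 2
               + phi(-2 * u) / phi(-u) ** 2
               - 2.0 / (phi(u) * phi(-u)))
    # The bracket is real up to rounding; keep the real part only.
    return (-bracket / (4 * u ** 2)).real

# Parameter values quoted in the text.
C, G, M, Y = 0.564, 1.0, 1.0, 0.5
u_grid = np.linspace(0.05, 3.0, 591)
ratios = efficiency_ratio(u_grid, C, G, M, Y)
ratio_min = ratios.min()
u_min = u_grid[ratios.argmin()]
print("min s/w^2 on the grid:", ratio_min, "attained near u =", u_min)

# Small-u check against 1 - kappa4 * u^2 / 3 with kappa4 = (2-Y)(3-Y)/G^2.
kappa4 = (2 - Y) * (3 - Y) / G ** 2
print(efficiency_ratio(0.1, C, G, M, Y), 1 - kappa4 * 0.1 ** 2 / 3)
```

Values below 1 indicate that the single-term ECF estimator has smaller asymptotic variance than the PE estimator at that $u$.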

9 Discussion 

In the previous section we assumed that $E[\Delta Z_n] = 0$. This is a standard assumption
in system identification, but certainly not realistic for financial data. Thus, e.g.,
in the case of a CGMY process this assumption would imply $G = M$, excluding
possible skewness in the distribution. While the case $E[\Delta Z_n] = m^* \neq 0$ would pose
no problem for the case of i.i.d. data, surprisingly the single term ECF method
may break down. The reason for this is that $V_{E\theta}(\theta,\eta)$ is no longer a score function,
since we cannot guarantee that

\[ E\left[V_{E\theta}(\theta^*,\eta)\right] = 0 \]

holds, see Lemma [T]. Namely, in the proof of Lemma [T] we make use of the equality

\[ E\left[\varepsilon_{n\theta}^*(\theta^*)\right] = 0, \tag{51} \]

which may not be valid. Note, however, that

\[ h_n(u;\theta,\eta) = e^{iu\varepsilon_n(\theta)} - \varphi(u,\eta) \]

does have the property required for a score function, namely

\[ E\left[e^{iu\varepsilon_n(\theta^*)} - \varphi(u,\eta^*)\right] = 0. \tag{52} \]



Thus, using an instrumental variable approach, we may choose an appropriate
linear combination of these score functions, say $\sum_{n=1}^{N} M h_n(\theta,\eta)$, where $M$ is a
$(p + r) \times k$ matrix, and consider the equation

\[ \sum_{n=1}^{N} M h_n(\theta,\eta) = 0. \]

Assuming that $p + r < k$ we may rightly expect that, after taking mathematical
expectation, the resulting equation has $(\theta^*,\eta^*)$ as an isolated solution, and we may proceed
as in Section 5. The elaboration of the details is the subject of ongoing research.

An alternative approach is to adapt our method of combining the PE method
with the ECF method. For this we first need to extend the PE method to deal
with the case $m^* \neq 0$, which is a standard exercise. Write $\Delta Z_n = \Delta e_n + m^*$, where
$E[\Delta e_n] = 0$. Then equation (112[) reads as

\[ \Delta y_n = A(\theta^*)(\Delta e_n + m^*). \]

Define the estimated innovation process by

\[ \varepsilon_n(\theta) = A^{-1}(\theta) A(\theta^*)(\Delta e_n + m^*). \]

Clearly $E[\varepsilon_n(\theta^*)] = m^*$, thus we define the cost function via

\[ V_N(\theta,m) = \frac{1}{2}\sum_{n=1}^{N} \left(\varepsilon_n(\theta) - m\right)^2. \]

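The estimate $(\hat\theta_N, \hat m_N)$ of $(\theta^*, m^*)$ is obtained by setting the gradient of $V_N$
to zero, which can be written as

\[ 0 = V_{N\theta}(\theta,m) = \sum_{n=1}^{N} \left(\varepsilon_n(\theta) - m\right)\varepsilon_{n\theta}(\theta), \]

\[ 0 = V_{Nm}(\theta,m) = -\sum_{n=1}^{N} \left(\varepsilon_n(\theta) - m\right). \]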
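
For a concrete feel of the extended PE step, consider the simplest case. The sketch below assumes, purely for illustration, a first-order autoregressive system with $A^{-1}(\theta) = 1 - \theta q^{-1}$, so that $\varepsilon_n(\theta) = \Delta y_n - \theta \Delta y_{n-1}$. The cost $V_N(\theta, m)$ is then quadratic in $(\theta, m)$ and the two normal equations reduce to ordinary least squares with an intercept. The model, the Laplace noise and the name `extended_pe_ar1` are assumptions of this sketch, not part of the original text.

```python
import numpy as np

def extended_pe_ar1(dy):
    """Minimize 0.5 * sum_n (dy[n] - theta * dy[n-1] - m)^2 over (theta, m)."""
    # Regressors: lagged observation (for theta) and a constant (for m).
    X = np.column_stack([dy[:-1], np.ones(len(dy) - 1)])
    coef, *_ = np.linalg.lstsq(X, dy[1:], rcond=None)
    theta_hat, m_hat = coef
    return theta_hat, m_hat

# Simulate dy_n = theta * dy_{n-1} + dZ_n with E[dZ_n] = m != 0.
rng = np.random.default_rng(1)
theta_true, m_true, N = 0.5, 0.3, 20000
# Hypothetical noise: Laplace with unit variance, shifted by m_true.
dZ = m_true + rng.laplace(0.0, 1.0 / np.sqrt(2.0), size=N)
dy = np.zeros(N)
for n in range(1, N):
    dy[n] = theta_true * dy[n - 1] + dZ[n]

theta_hat, m_hat = extended_pe_ar1(dy)
print("theta_hat =", theta_hat, "m_hat =", m_hat)
```

For general rational $A(\theta)$ the cost is no longer quadratic in $\theta$ and an iterative solver would be needed; the residuals $\varepsilon_n(\hat\theta_N) - \hat m_N$ could then be passed on to the ECF step for the noise parameters.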



Having estimated the system dynamics with this extended PE method one may 
estimate the noise parameters with the ECF method, as in Section [7] 
The shortcoming of the above approach is that it does not fully exploit the
potential of the ECF method in estimating the system dynamics. Therefore we
suggest a second pass for estimating $\theta^*$ via a single term ECF method, with $\hat\eta_N$
considered as the true parameter, applied to the system

\[ \Delta y_n - A(\hat\theta_N)\hat m_N = A(\theta^*)(\Delta Z_n - \hat m_N), \tag{53} \]

where $\hat m_N, \hat\theta_N, \hat\eta_N$ are the first estimates. Defining $\Delta\tilde y_n = \Delta y_n - A(\hat\theta_N)\hat m_N$ and
$\Delta\tilde Z_n = \Delta Z_n - \hat m_N$, the previous equation reads as

\[ \Delta\tilde y_n = A(\theta^*)\Delta\tilde Z_n, \tag{54} \]

with $E[\Delta\tilde Z_n] = 0$. Thus we may proceed according to Section [5] to obtain the
corrected estimate of $\theta^*$.

What we have obtained is an extension of the single term ECF method, which 
is computationally simpler. Ongoing investigations suggest that the efficiency of 
this generalized single term ECF method is as good as the original single term 
ECF when m* = 0. 

Finally we mention one more, very different approach to deal with the problem
of non-zero expectation, which is of interest in its own right. The idea is to use an ECF
method directly for blocks of unprocessed data, i.e. for blocks of the time series
$(y_n)$. For this purpose let us embed our data into the class of time series

\[ \Delta y_n(\theta,\eta) = A(\theta)\Delta Z_n(\eta). \]

Note that for $(\theta,\eta) = (\theta^*,\eta^*)$ we recover (in a statistical sense) our observed data.
Fix a block length, say $r$, and define the $r$-dimensional blocks

\[ \Delta Y_n^r(\theta,\eta) = \left(\Delta y_n(\theta,\eta), \ldots, \Delta y_{n+r-1}(\theta,\eta)\right). \]

Letting $U$ be an arbitrary $r$-vector, the characteristic function of $\Delta Y_n^r(\theta,\eta)$ is given
by

\[ \varphi_n(U,\theta,\eta) = E\left[\exp\{iU^T \Delta Y_n^r(\theta,\eta)\}\right], \]

and the corresponding score function will be defined as

\[ h_n(U,\theta,\eta) = e^{iU^T \Delta Y_n^r} - \varphi_n(U,\theta,\eta). \]

The point is that the characteristic function can be explicitly computed, at least
in theory, as

\[ \varphi_n(U,\theta,\eta) = E\left[\exp\{iU^T \Delta Y_n^r(\theta,\eta)\}\right] = \prod_{j=1}^{r}\prod_{l=0}^{\infty} \varphi_{\Delta Z(\eta)}\left(v_{j,l}(\theta)\right), \tag{55} \]

with some $\theta$-dependent constants $v_{j,l}$. Here $\varphi_{\Delta Z(\eta)}$ denotes the characteristic
function of $\Delta Z_1(\eta)$. The weakness of this approach is that the characteristic function
$\varphi_n(U,\theta,\eta)$ is given in terms of an infinite product, therefore it is not clear how to
use it in actual computations.
10 Appendix

Let $\theta$ be a $d$-dimensional parameter vector.

Definition 2 We say that $x_n(\theta)$ is M-bounded of order $Q$ if for all $1 \le q \le Q$,

\[ M_q(x) = \sup_{n \ge 0,\ \theta \in D} E^{1/q}|x_n(\theta)|^q < \infty. \]

Define $\mathcal{F}_n = \sigma\{e_i : i \le n\}$ and $\mathcal{F}_n^+ = \sigma\{e_i : i > n\}$, where the $e_i$-s are i.i.d. random
variables.

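Definition 3 We say that a stochastic process $(x_n(\theta))$ is L-mixing of order $Q$
with respect to $(\mathcal{F}_n, \mathcal{F}_n^+)$ uniformly in $\theta$ if it is $\mathcal{F}_n$-progressively measurable,
M-bounded of order $Q$, and, with any positive $r$ and

\[ \gamma_q(r,x) = \gamma_q(r) = \sup_{n \ge r,\ \theta \in D} E^{1/q}\left|x_n(\theta) - E\left[x_n(\theta) \mid \mathcal{F}_{n-r}^+\right]\right|^q, \]

we have for any $1 \le q \le Q$,

\[ \Gamma_q(x) = \sum_{r=1}^{\infty} \gamma_q(r) < \infty. \]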

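Theorem 7 Let $(u_n)$, $n \ge 0$ be an L-mixing process of order $Q$ with $E u_n = 0$ for all
$n$, and let $(f_n)$ be a deterministic sequence. Then we have for all $1 \le m \le Q/2$,

\[ E^{1/(2m)}\left|\sum_{n=1}^{N} f_n u_n\right|^{2m} \le C_m \left(\sum_{n=1}^{N} f_n^2\right)^{1/2} M_{2m}^{1/2}(u)\,\Gamma_{2m}^{1/2}(u), \]

where $C_m = 2(2m - 1)^{1/2}$.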
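Define

\[ \Delta x/\Delta^\alpha\theta = |x_n(\theta + h) - x_n(\theta)| / |h|^\alpha \]

for $n \ge 0$, $\theta \ne \theta + h \in D$ with $0 < \alpha \le 1$.

Definition 4 We say that $x_n(\theta)$ is M-Hölder continuous of order $Q$ in $\theta$ with
exponent $\alpha$ if the process $\Delta x/\Delta^\alpha\theta$ is M-bounded of order $Q$.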

Now let us suppose that $(x_n(\theta))$ is measurable, separable, M-bounded of order
$Q$ and M-Hölder of order $Q$ in $\theta$ with exponent $\alpha$ for $\theta \in D$. The realizations of
$(x_n(\theta))$ are continuous in $\theta$ almost surely, hence

\[ x_n^* = \max_{\theta \in D_0} |x_n(\theta)| \]

is well defined for almost all $\omega$, where $D_0 \subset \operatorname{int} D$ is a compact domain. Since the
realizations of $(x_n(\theta))$ are continuous, $x_n^*$ is measurable with respect to $\mathcal{F}$.

Theorem 8 Assume that $(x_n(\theta))$ is measurable, separable, M-bounded of order $Q$ and
M-Hölder of order $Q$ in $\theta$ with exponent $\alpha$ for $\theta \in D$. Then we have for all positive
$q < Q\alpha/s$ and $p/\alpha < s < Q/q$,

\[ M_q(x^*) \le C\left(M_{qs}(x) + M_{qs}(\Delta x/\Delta^\alpha\theta)\right), \]

where $C$ depends only on $p, q, s, \alpha$ and $D_0, D$.
Choosing $f_n = 1$ and $\alpha = 1$ and using Theorem [7] and [8] we obtain

Theorem 9 Let $(u_n(\theta))$ be L-mixing of order $Q$ uniformly in $\theta \in D$ such that
$E u_n(\theta) = 0$ for all $n \ge 0$, $\theta \in D$, and assume that $\Delta u/\Delta\theta$ is also L-mixing of order $Q$,
uniformly in $\theta, \theta + h \in D$. Then

\[ \sup_{\theta \in D_0} \left|\frac{1}{N}\sum_{n=1}^{N} u_n(\theta)\right| = O_M^{Q/p}(N^{-1/2}). \tag{57} \]



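Theorem 10 Let $D_0$ and $D$ be as above. Let $W_\theta(\theta)$, $\delta W_\theta(\theta)$, $\theta \in D \subset \mathbb{R}^p$ be
$\mathbb{R}^p$-valued continuously differentiable functions, let for some $\theta^* \in D_0$, $W_\theta(\theta^*) = 0$, and let
$W_{\theta\theta}(\theta^*)$ be nonsingular. Then for any $d > 0$ there exist positive numbers $d', d''$ such
that

\[ |\delta W_\theta(\theta)| < d' \quad\text{and}\quad \|\delta W_{\theta\theta}(\theta)\| < d'' \tag{58} \]

for all $\theta \in D_0$ implies that the equation $W_\theta(\theta) + \delta W_\theta(\theta) = 0$ has exactly one solution
in a neighborhood of radius $d$ of $\theta^*$.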